Overview

Dataset statistics

Number of variables: 22
Number of observations: 21613
Missing cells: 320
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.6 MiB
Average record size in memory: 176.0 B
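
The percentages in this block can be re-derived from the counts. The sketch below recomputes the missing-cell share from the figures above. It also approximates the record size, on the assumption that this is total memory divided by the number of rows; the report does not state the formula.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 22
n_observations = 21613
missing_cells = 320

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.2f}%")

# Assumption: average record size = total memory / number of rows.
# The 3.6 MiB total is already rounded, so this only approximates 176.0 B.
approx_record_bytes = 3.6 * 1024 * 1024 / n_observations
print(f"approx. record size: {approx_record_bytes:.1f} B")
```

At one decimal place the missing share rounds to the 0.1% shown in the report.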

Variable types

DateTime: 1
Numeric: 14
Categorical: 7

Alerts

price is highly correlated with sqft_living and 3 other fields (High correlation)
bedrooms is highly correlated with bathrooms and 2 other fields (High correlation)
bathrooms is highly correlated with bedrooms and 6 other fields (High correlation)
sqft_living is highly correlated with price and 5 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
floors is highly correlated with bathrooms and 3 other fields (High correlation)
grade is highly correlated with price and 6 other fields (High correlation)
sqft_above is highly correlated with price and 6 other fields (High correlation)
yr_built is highly correlated with bathrooms and 2 other fields (High correlation)
sqft_living15 is highly correlated with price and 4 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
price is highly correlated with bathrooms and 4 other fields (High correlation)
bedrooms is highly correlated with bathrooms and 1 other field (High correlation)
bathrooms is highly correlated with price and 7 other fields (High correlation)
sqft_living is highly correlated with price and 5 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
floors is highly correlated with bathrooms and 1 other field (High correlation)
grade is highly correlated with price and 4 other fields (High correlation)
sqft_above is highly correlated with price and 5 other fields (High correlation)
yr_built is highly correlated with bathrooms (High correlation)
sqft_living15 is highly correlated with price and 4 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
price is highly correlated with grade (High correlation)
bedrooms is highly correlated with sqft_living (High correlation)
bathrooms is highly correlated with sqft_living and 2 other fields (High correlation)
sqft_living is highly correlated with bedrooms and 4 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
grade is highly correlated with price and 4 other fields (High correlation)
sqft_above is highly correlated with bathrooms and 3 other fields (High correlation)
sqft_living15 is highly correlated with sqft_living and 2 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
condition is highly correlated with condition_type (High correlation)
view is highly correlated with waterfront (High correlation)
condition_type is highly correlated with condition (High correlation)
waterfront is highly correlated with view (High correlation)
price is highly correlated with bathrooms and 6 other fields (High correlation)
bedrooms is highly correlated with bathrooms and 2 other fields (High correlation)
bathrooms is highly correlated with price and 9 other fields (High correlation)
sqft_living is highly correlated with price and 7 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
floors is highly correlated with yr_built and 1 other field (High correlation)
condition is highly correlated with yr_built and 1 other field (High correlation)
grade is highly correlated with price and 5 other fields (High correlation)
sqft_above is highly correlated with price and 7 other fields (High correlation)
sqft_basement is highly correlated with price and 3 other fields (High correlation)
yr_built is highly correlated with bathrooms and 4 other fields (High correlation)
lat is highly correlated with price_tier (High correlation)
long is highly correlated with yr_built (High correlation)
sqft_living15 is highly correlated with price and 5 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
house_age is highly correlated with bathrooms and 3 other fields (High correlation)
dormitory_type is highly correlated with bathrooms (High correlation)
condition_type is highly correlated with condition (High correlation)
price_tier is highly correlated with price and 4 other fields (High correlation)
house_age has 320 (1.5%) missing values (Missing)
sqft_basement has 13126 (60.7%) zeros (Zeros)
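
The same variable appears in several alerts, presumably because the report evaluates more than one correlation measure. As a rough illustration of how such a flag could be produced, the sketch below computes Pearson's r for every pair of columns and keeps the pairs above a cut-off. The function name `high_correlation_pairs`, the 0.9 threshold and the toy rows are assumptions made for this example; they are not taken from the report or from the pandas-profiling source.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for pairs whose |r| reaches the threshold.

    The threshold is a hypothetical default chosen for this sketch.
    """
    flagged = []
    for (a, xs), (b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Toy rows invented for illustration; they are NOT values from the dataset.
toy = {
    "sqft_lot": [5000, 6000, 4000, 7200, 4800],
    "sqft_lot15": [5100, 6100, 4200, 7000, 4700],
    "floors": [1, 2, 1, 1, 2],
}
flagged = high_correlation_pairs(toy)
print(flagged)
```

Only the sqft_lot / sqft_lot15 pair is flagged on the toy rows, which mirrors the alert above in shape but says nothing about the real coefficients.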

Reproduction

Analysis started: 2022-04-30 15:08:16.557314
Analysis finished: 2022-04-30 15:09:07.691707
Duration: 51.13 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
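
The reported duration is the difference between the two timestamps above. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-04-30 15:08:16.557314")
finished = datetime.fromisoformat("2022-04-30 15:09:07.691707")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```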

Variables

date
Date

Distinct: 372
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
Minimum: 2014-05-02 00:00:00
Maximum: 2015-05-27 00:00:00
Histogram with fixed size bins (bins=50)
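
Comparing the distinct count with the calendar span shows how many days in the window have no observation at all. The sketch below uses only the minimum, maximum and distinct count reported for date. It assumes every distinct value is a whole calendar day, which the midnight timestamps suggest but do not prove.

```python
from datetime import date

# Minimum and maximum copied from the date summary.
first_day = date(2014, 5, 2)
last_day = date(2015, 5, 27)
distinct_dates = 372

calendar_days = (last_day - first_day).days + 1   # inclusive span
days_without_rows = calendar_days - distinct_dates

print(calendar_days, days_without_rows)
```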

price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4028
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 540088.1418
Minimum: 75000
Maximum: 7700000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 75000
5-th percentile: 210000
Q1: 321950
median: 450000
Q3: 645000
95-th percentile: 1156480
Maximum: 7700000
Range: 7625000
Interquartile range (IQR): 323050
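
Range and IQR follow directly from the quantiles listed above. The sketch below recomputes them and adds Tukey's 1.5 × IQR fences as one common way to read the long right tail. The fences are not part of the report and are included only as an illustration.

```python
# Quantiles copied from the price "Quantile statistics" block.
minimum, q1, q3, p95, maximum = 75000, 321950, 645000, 1156480, 7700000

value_range = maximum - minimum
iqr = q3 - q1

# Tukey fences: an outlier convention added for illustration, not report output.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, lower_fence, upper_fence)
print("95th percentile above upper fence:", p95 > upper_fence)
```

Because the 95th percentile already lies above the upper fence, at least 5% of the prices would count as high outliers under this rule, while the negative lower fence means no low outliers are possible.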

Descriptive statistics

Standard deviation: 367127.1965
Coefficient of variation (CV): 0.6797542255
Kurtosis: 34.58554043
Mean: 540088.1418
Median Absolute Deviation (MAD): 150000
Skewness: 4.024069145
Sum: 1.167292501 × 10^10
Variance: 1.347823784 × 10^11
Monotonicity: Not monotonic
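
Several of the descriptive statistics are functions of the others, which gives a quick consistency check. The sketch below re-derives the mean, the coefficient of variation and the variance from the reported sum, standard deviation and row count.

```python
from math import isclose

# Figures copied from the report (sum is printed there as 1.167292501 x 10^10).
n_rows = 21613
total = 1.167292501e10
std = 367127.1965

mean = total / n_rows      # mean = sum / n
cv = std / mean            # coefficient of variation = std / mean
variance = std ** 2        # variance = std squared

print(f"mean={mean:.4f} cv={cv:.10f} variance={variance:.6e}")
print(isclose(mean, 540088.1418, rel_tol=1e-6))
```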
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
350000                172     0.8%
450000                172     0.8%
550000                159     0.7%
500000                152     0.7%
425000                150     0.7%
325000                148     0.7%
400000                145     0.7%
375000                138     0.6%
300000                133     0.6%
525000                131     0.6%
Other values (4018)   20113   93.1%

Value                 Count   Frequency (%)
75000                 1       < 0.1%
78000                 1       < 0.1%
80000                 1       < 0.1%
81000                 1       < 0.1%
82000                 1       < 0.1%
82500                 1       < 0.1%
83000                 1       < 0.1%
84000                 1       < 0.1%
85000                 2       < 0.1%
86500                 1       < 0.1%

Value                 Count   Frequency (%)
7700000               1       < 0.1%
7062500               1       < 0.1%
6885000               1       < 0.1%
5570000               1       < 0.1%
5350000               1       < 0.1%
5300000               1       < 0.1%
5110800               1       < 0.1%
4668000               1       < 0.1%
4500000               1       < 0.1%
4489000               1       < 0.1%

bedrooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.370841623
Minimum: 0
Maximum: 33
Zeros: 13
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
median: 3
Q3: 4
95-th percentile: 5
Maximum: 33
Range: 33
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9300618311
Coefficient of variation (CV): 0.2759138325
Kurtosis: 49.06365318
Mean: 3.370841623
Median Absolute Deviation (MAD): 1
Skewness: 1.974299535
Sum: 72854
Variance: 0.8650150098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)

Value              Count   Frequency (%)
3                  9824    45.5%
4                  6882    31.8%
2                  2760    12.8%
5                  1601    7.4%
6                  272     1.3%
1                  199     0.9%
7                  38      0.2%
0                  13      0.1%
8                  13      0.1%
9                  6       < 0.1%
Other values (3)   5       < 0.1%

Value              Count   Frequency (%)
0                  13      0.1%
1                  199     0.9%
2                  2760    12.8%
3                  9824    45.5%
4                  6882    31.8%
5                  1601    7.4%
6                  272     1.3%
7                  38      0.2%
8                  13      0.1%
9                  6       < 0.1%

Value              Count   Frequency (%)
33                 1       < 0.1%
11                 1       < 0.1%
10                 3       < 0.1%
9                  6       < 0.1%
8                  13      0.1%
7                  38      0.2%
6                  272     1.3%
5                  1601    7.4%
4                  6882    31.8%
3                  9824    45.5%
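
The bedrooms histogram uses 13 bins where price uses 50. Across this report the bin count matches the smaller of the distinct count and 50. The sketch below encodes that rule and checks it against the captions; the rule and the cap are inferred from this report, not taken from documented pandas-profiling behaviour.

```python
MAX_BINS = 50  # assumption: cap inferred from the histogram captions


def histogram_bins(n_distinct: int, max_bins: int = MAX_BINS) -> int:
    """One bin per distinct value until the cap is reached (inferred rule)."""
    return min(n_distinct, max_bins)


# (distinct values, bins in the histogram caption), both read from this report
reported = {
    "price": (4028, 50),
    "bedrooms": (13, 13),
    "bathrooms": (30, 30),
    "floors": (6, 6),
    "grade": (12, 12),
    "yr_built": (116, 50),
}

mismatches = [name for name, (distinct, bins) in reported.items()
              if histogram_bins(distinct) != bins]
print(mismatches)
```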

bathrooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.114757322
Minimum: 0
Maximum: 8
Zeros: 10
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1.75
median: 2.25
Q3: 2.5
95-th percentile: 3.5
Maximum: 8
Range: 8
Interquartile range (IQR): 0.75

Descriptive statistics

Standard deviation: 0.7701631572
Coefficient of variation (CV): 0.3641851238
Kurtosis: 1.279902444
Mean: 2.114757322
Median Absolute Deviation (MAD): 0.5
Skewness: 0.5111075733
Sum: 45706.25
Variance: 0.5931512887
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Value               Count   Frequency (%)
2.5                 5380    24.9%
1                   3852    17.8%
1.75                3048    14.1%
2.25                2047    9.5%
2                   1930    8.9%
1.5                 1446    6.7%
2.75                1185    5.5%
3                   753     3.5%
3.5                 731     3.4%
3.25                589     2.7%
Other values (20)   652     3.0%

Value               Count   Frequency (%)
0                   10      < 0.1%
0.5                 4       < 0.1%
0.75                72      0.3%
1                   3852    17.8%
1.25                9       < 0.1%
1.5                 1446    6.7%
1.75                3048    14.1%
2                   1930    8.9%
2.25                2047    9.5%
2.5                 5380    24.9%

Value               Count   Frequency (%)
8                   2       < 0.1%
7.75                1       < 0.1%
7.5                 1       < 0.1%
6.75                2       < 0.1%
6.5                 2       < 0.1%
6.25                2       < 0.1%
6                   6       < 0.1%
5.75                4       < 0.1%
5.5                 10      < 0.1%
5.25                13      0.1%

sqft_living
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1038
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2079.899736
Minimum: 290
Maximum: 13540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 290
5-th percentile: 940
Q1: 1427
median: 1910
Q3: 2550
95-th percentile: 3760
Maximum: 13540
Range: 13250
Interquartile range (IQR): 1123

Descriptive statistics

Standard deviation: 918.440897
Coefficient of variation (CV): 0.4415794093
Kurtosis: 5.24309299
Mean: 2079.899736
Median Absolute Deviation (MAD): 540
Skewness: 1.471555427
Sum: 44952873
Variance: 843533.6814
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
1300                  138     0.6%
1400                  135     0.6%
1440                  133     0.6%
1800                  129     0.6%
1660                  129     0.6%
1010                  129     0.6%
1820                  128     0.6%
1480                  125     0.6%
1720                  125     0.6%
1540                  124     0.6%
Other values (1028)   20318   94.0%

Value                 Count   Frequency (%)
290                   1       < 0.1%
370                   1       < 0.1%
380                   1       < 0.1%
384                   1       < 0.1%
390                   2       < 0.1%
410                   1       < 0.1%
420                   2       < 0.1%
430                   1       < 0.1%
440                   1       < 0.1%
460                   1       < 0.1%

Value                 Count   Frequency (%)
13540                 1       < 0.1%
12050                 1       < 0.1%
10040                 1       < 0.1%
9890                  1       < 0.1%
9640                  1       < 0.1%
9200                  1       < 0.1%
8670                  1       < 0.1%
8020                  1       < 0.1%
8010                  1       < 0.1%
8000                  1       < 0.1%

sqft_lot
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9782
Distinct (%): 45.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15106.96757
Minimum: 520
Maximum: 1651359
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 520
5-th percentile: 1800
Q1: 5040
median: 7618
Q3: 10688
95-th percentile: 43339.2
Maximum: 1651359
Range: 1650839
Interquartile range (IQR): 5648

Descriptive statistics

Standard deviation: 41420.51152
Coefficient of variation (CV): 2.741815082
Kurtosis: 285.0778197
Mean: 15106.96757
Median Absolute Deviation (MAD): 2618
Skewness: 13.06001896
Sum: 326506890
Variance: 1715658774
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
5000                  358     1.7%
6000                  290     1.3%
4000                  251     1.2%
7200                  220     1.0%
4800                  120     0.6%
7500                  119     0.6%
4500                  114     0.5%
8400                  111     0.5%
9600                  109     0.5%
3600                  103     0.5%
Other values (9772)   19818   91.7%

Value                 Count   Frequency (%)
520                   1       < 0.1%
572                   1       < 0.1%
600                   1       < 0.1%
609                   1       < 0.1%
635                   1       < 0.1%
638                   1       < 0.1%
649                   2       < 0.1%
651                   1       < 0.1%
675                   1       < 0.1%
676                   1       < 0.1%

Value                 Count   Frequency (%)
1651359               1       < 0.1%
1164794               1       < 0.1%
1074218               1       < 0.1%
1024068               1       < 0.1%
982998                1       < 0.1%
982278                1       < 0.1%
920423                1       < 0.1%
881654                1       < 0.1%
871200                2       < 0.1%
843309                1       < 0.1%

floors
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.494308981
Minimum: 1
Maximum: 3.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1.5
Q3: 2
95-th percentile: 2
Maximum: 3.5
Range: 2.5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.5399888951
Coefficient of variation (CV): 0.361363615
Kurtosis: -0.4847229368
Mean: 1.494308981
Median Absolute Deviation (MAD): 0.5
Skewness: 0.6161767212
Sum: 32296.5
Variance: 0.2915880069
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value   Count   Frequency (%)
1       10680   49.4%
2       8241    38.1%
1.5     1910    8.8%
3       613     2.8%
2.5     161     0.7%
3.5     8       < 0.1%

Value   Count   Frequency (%)
1       10680   49.4%
1.5     1910    8.8%
2       8241    38.1%
2.5     161     0.7%
3       613     2.8%
3.5     8       < 0.1%

Value   Count   Frequency (%)
3.5     8       < 0.1%
3       613     2.8%
2.5     161     0.7%
2       8241    38.1%
1.5     1910    8.8%
1       10680   49.4%

waterfront
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
0: 21450
1: 163

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       21450   99.2%
1       163     0.8%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
0       21450   99.2%
1       163     0.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

view
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
0: 19489
2: 963
3: 510
1: 332
4: 319

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       19489   90.2%
2       963     4.5%
3       510     2.4%
1       332     1.5%
4       319     1.5%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
0       19489   90.2%
2       963     4.5%
3       510     2.4%
1       332     1.5%
4       319     1.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

condition
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
3: 14031
4: 5679
5: 1701
2: 172
1: 30

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 5
5th row: 3

Common Values

Value   Count   Frequency (%)
3       14031   64.9%
4       5679    26.3%
5       1701    7.9%
2       172     0.8%
1       30      0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
3       14031   64.9%
4       5679    26.3%
5       1701    7.9%
2       172     0.8%
1       30      0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

grade
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.656873178
Minimum: 1
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 7
median: 7
Q3: 8
95-th percentile: 10
Maximum: 13
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.175458757
Coefficient of variation (CV): 0.1535168116
Kurtosis: 1.190932077
Mean: 7.656873178
Median Absolute Deviation (MAD): 1
Skewness: 0.7711032008
Sum: 165488
Variance: 1.381703289
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Value              Count   Frequency (%)
7                  8981    41.6%
8                  6068    28.1%
9                  2615    12.1%
6                  2038    9.4%
10                 1134    5.2%
11                 399     1.8%
5                  242     1.1%
12                 90      0.4%
4                  29      0.1%
13                 13      0.1%
Other values (2)   4       < 0.1%

Value              Count   Frequency (%)
1                  1       < 0.1%
3                  3       < 0.1%
4                  29      0.1%
5                  242     1.1%
6                  2038    9.4%
7                  8981    41.6%
8                  6068    28.1%
9                  2615    12.1%
10                 1134    5.2%
11                 399     1.8%

Value              Count   Frequency (%)
13                 13      0.1%
12                 90      0.4%
11                 399     1.8%
10                 1134    5.2%
9                  2615    12.1%
8                  6068    28.1%
7                  8981    41.6%
6                  2038    9.4%
5                  242     1.1%
4                  29      0.1%

sqft_above
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 946
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1788.390691
Minimum: 290
Maximum: 9410
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 290
5-th percentile: 850
Q1: 1190
median: 1560
Q3: 2210
95-th percentile: 3400
Maximum: 9410
Range: 9120
Interquartile range (IQR): 1020

Descriptive statistics

Standard deviation: 828.0909777
Coefficient of variation (CV): 0.4630369538
Kurtosis: 3.402303621
Mean: 1788.390691
Median Absolute Deviation (MAD): 450
Skewness: 1.446664473
Sum: 38652488
Variance: 685734.6673
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
1300                 212     1.0%
1010                 210     1.0%
1200                 206     1.0%
1220                 192     0.9%
1140                 184     0.9%
1400                 180     0.8%
1060                 178     0.8%
1180                 177     0.8%
1340                 176     0.8%
1250                 174     0.8%
Other values (936)   19724   91.3%

Value                Count   Frequency (%)
290                  1       < 0.1%
370                  1       < 0.1%
380                  1       < 0.1%
384                  1       < 0.1%
390                  2       < 0.1%
410                  1       < 0.1%
420                  2       < 0.1%
430                  1       < 0.1%
440                  1       < 0.1%
460                  1       < 0.1%

Value                Count   Frequency (%)
9410                 1       < 0.1%
8860                 1       < 0.1%
8570                 1       < 0.1%
8020                 1       < 0.1%
7880                 1       < 0.1%
7850                 1       < 0.1%
7680                 1       < 0.1%
7420                 1       < 0.1%
7320                 1       < 0.1%
6720                 1       < 0.1%

sqft_basement
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 306
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 291.5090455
Minimum: 0
Maximum: 4820
Zeros: 13126
Zeros (%): 60.7%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 560
95-th percentile: 1190
Maximum: 4820
Range: 4820
Interquartile range (IQR): 560

Descriptive statistics

Standard deviation: 442.5750427
Coefficient of variation (CV): 1.518220616
Kurtosis: 2.715574211
Mean: 291.5090455
Median Absolute Deviation (MAD): 0
Skewness: 1.577965056
Sum: 6300385
Variance: 195872.6684
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
0                    13126   60.7%
600                  221     1.0%
700                  218     1.0%
500                  214     1.0%
800                  206     1.0%
400                  184     0.9%
1000                 149     0.7%
900                  144     0.7%
300                  142     0.7%
200                  108     0.5%
Other values (296)   6901    31.9%

Value                Count   Frequency (%)
0                    13126   60.7%
10                   2       < 0.1%
20                   1       < 0.1%
40                   4       < 0.1%
50                   11      0.1%
60                   10      < 0.1%
65                   1       < 0.1%
70                   7       < 0.1%
80                   20      0.1%
90                   21      0.1%

Value                Count   Frequency (%)
4820                 1       < 0.1%
4130                 1       < 0.1%
3500                 1       < 0.1%
3480                 1       < 0.1%
3260                 1       < 0.1%
3000                 1       < 0.1%
2850                 1       < 0.1%
2810                 1       < 0.1%
2730                 1       < 0.1%
2720                 1       < 0.1%
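
The column sums reported for sqft_living, sqft_above and sqft_basement add up exactly, which fits sqft_living being the sum of the other two and may explain part of their mutual correlation. Matching totals do not prove the identity row by row, so the sketch below treats it as an observation only. It also re-derives the zero share and the mean basement size among homes that have one.

```python
# Sums and counts copied from the three variable sections of this report.
n_rows = 21613
sum_living = 44952873
sum_above = 38652488
sum_basement = 6300385
zeros_basement = 13126

totals_match = (sum_above + sum_basement == sum_living)

zero_share = 100 * zeros_basement / n_rows            # the "Zeros (%)" figure
homes_with_basement = n_rows - zeros_basement
mean_nonzero_basement = sum_basement / homes_with_basement

print(totals_match, round(zero_share, 1), homes_with_basement)
print(f"mean basement size where present: {mean_nonzero_basement:.1f} sqft")
```

The overall mean of about 291.5 sqft is pulled down by the zeros; among homes with a basement the mean is much larger.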

yr_built
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 116
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1971.005136
Minimum: 1900
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 1900
5-th percentile: 1915
Q1: 1951
median: 1975
Q3: 1997
95-th percentile: 2011
Maximum: 2015
Range: 115
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 29.3734108
Coefficient of variation (CV): 0.01490275711
Kurtosis: -0.6574075047
Mean: 1971.005136
Median Absolute Deviation (MAD): 23
Skewness: -0.4698053988
Sum: 42599334
Variance: 862.7972622
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
2014                 559     2.6%
2006                 454     2.1%
2005                 450     2.1%
2004                 433     2.0%
2003                 422     2.0%
2007                 417     1.9%
1977                 417     1.9%
1978                 387     1.8%
1968                 381     1.8%
2008                 367     1.7%
Other values (106)   17326   80.2%

Value                Count   Frequency (%)
1900                 87      0.4%
1901                 29      0.1%
1902                 27      0.1%
1903                 46      0.2%
1904                 45      0.2%
1905                 74      0.3%
1906                 92      0.4%
1907                 65      0.3%
1908                 86      0.4%
1909                 94      0.4%

Value                Count   Frequency (%)
2015                 38      0.2%
2014                 559     2.6%
2013                 201     0.9%
2012                 170     0.8%
2011                 130     0.6%
2010                 143     0.7%
2009                 230     1.1%
2008                 367     1.7%
2007                 417     1.9%
2006                 454     2.1%

lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5034
Distinct (%): 23.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.56005252
Minimum: 47.1559
Maximum: 47.7776
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 47.1559
5-th percentile: 47.3103
Q1: 47.471
median: 47.5718
Q3: 47.678
95-th percentile: 47.74964
Maximum: 47.7776
Range: 0.6217
Interquartile range (IQR): 0.207

Descriptive statistics

Standard deviation: 0.1385637102
Coefficient of variation (CV): 0.002913447377
Kurtosis: -0.6763130016
Mean: 47.56005252
Median Absolute Deviation (MAD): 0.1049
Skewness: -0.4852704765
Sum: 1027915.415
Variance: 0.0191999018
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
47.6624               17      0.1%
47.5322               17      0.1%
47.6846               17      0.1%
47.5491               17      0.1%
47.6955               16      0.1%
47.6886               16      0.1%
47.6711               16      0.1%
47.5402               15      0.1%
47.6842               15      0.1%
47.6904               15      0.1%
Other values (5024)   21452   99.3%

Value                 Count   Frequency (%)
47.1559               1       < 0.1%
47.1593               1       < 0.1%
47.1622               1       < 0.1%
47.1647               1       < 0.1%
47.1764               1       < 0.1%
47.1775               1       < 0.1%
47.1776               2       < 0.1%
47.1795               1       < 0.1%
47.1803               1       < 0.1%
47.1808               1       < 0.1%

Value                 Count   Frequency (%)
47.7776               3       < 0.1%
47.7775               3       < 0.1%
47.7774               1       < 0.1%
47.7772               3       < 0.1%
47.7771               2       < 0.1%
47.777                2       < 0.1%
47.7769               3       < 0.1%
47.7768               2       < 0.1%
47.7767               6       < 0.1%
47.7766               4       < 0.1%

long
Real number (ℝ)

HIGH CORRELATION

Distinct: 752
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -122.2138964
Minimum: -122.519
Maximum: -121.315
Zeros: 0
Zeros (%): 0.0%
Negative: 21613
Negative (%): 100.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: -122.519
5-th percentile: -122.387
Q1: -122.328
median: -122.23
Q3: -122.125
95-th percentile: -121.979
Maximum: -121.315
Range: 1.204
Interquartile range (IQR): 0.203

Descriptive statistics

Standard deviation: 0.1408283424
Coefficient of variation (CV): -0.001152310388
Kurtosis: 1.049500887
Mean: -122.2138964
Median Absolute Deviation (MAD): 0.101
Skewness: 0.8850529834
Sum: -2641408.943
Variance: 0.01983262202
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
-122.29              116     0.5%
-122.3               111     0.5%
-122.362             104     0.5%
-122.291             100     0.5%
-122.363             99      0.5%
-122.372             99      0.5%
-122.288             98      0.5%
-122.357             96      0.4%
-122.284             95      0.4%
-122.365             94      0.4%
Other values (742)   20601   95.3%

Value                Count   Frequency (%)
-122.519             1       < 0.1%
-122.515             1       < 0.1%
-122.514             1       < 0.1%
-122.512             1       < 0.1%
-122.511             2       < 0.1%
-122.509             2       < 0.1%
-122.507             1       < 0.1%
-122.506             1       < 0.1%
-122.505             3       < 0.1%
-122.504             2       < 0.1%

Value                Count   Frequency (%)
-121.315             2       < 0.1%
-121.316             1       < 0.1%
-121.319             1       < 0.1%
-121.321             1       < 0.1%
-121.325             1       < 0.1%
-121.352             2       < 0.1%
-121.359             1       < 0.1%
-121.364             2       < 0.1%
-121.402             1       < 0.1%
-121.403             1       < 0.1%

sqft_living15
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 777
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1986.552492
Minimum: 399
Maximum: 6210
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 399
5-th percentile: 1140
Q1: 1490
median: 1840
Q3: 2360
95-th percentile: 3300
Maximum: 6210
Range: 5811
Interquartile range (IQR): 870

Descriptive statistics

Standard deviation: 685.3913043
Coefficient of variation (CV): 0.3450154512
Kurtosis: 1.59709581
Mean: 1986.552492
Median Absolute Deviation (MAD): 410
Skewness: 1.108181276
Sum: 42935359
Variance: 469761.2399
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
1540                 197     0.9%
1440                 195     0.9%
1560                 192     0.9%
1500                 181     0.8%
1460                 169     0.8%
1580                 167     0.8%
1610                 166     0.8%
1720                 166     0.8%
1800                 166     0.8%
1620                 165     0.8%
Other values (767)   19849   91.8%

Value                Count   Frequency (%)
399                  1       < 0.1%
460                  2       < 0.1%
620                  2       < 0.1%
670                  1       < 0.1%
690                  2       < 0.1%
700                  2       < 0.1%
710                  2       < 0.1%
720                  2       < 0.1%
740                  8       < 0.1%
750                  3       < 0.1%

Value                Count   Frequency (%)
6210                 1       < 0.1%
6110                 1       < 0.1%
5790                 6       < 0.1%
5610                 1       < 0.1%
5600                 1       < 0.1%
5500                 1       < 0.1%
5380                 1       < 0.1%
5340                 1       < 0.1%
5330                 1       < 0.1%
5220                 1       < 0.1%

sqft_lot15
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8689
Distinct (%): 40.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12768.45565
Minimum: 651
Maximum: 871200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 651
5-th percentile: 1999.2
Q1: 5100
median: 7620
Q3: 10083
95-th percentile: 37062.8
Maximum: 871200
Range: 870549
Interquartile range (IQR): 4983

Descriptive statistics

Standard deviation: 27304.17963
Coefficient of variation (CV): 2.138408933
Kurtosis: 150.76311
Mean: 12768.45565
Median Absolute Deviation (MAD): 2505
Skewness: 9.506743247
Sum: 275964632
Variance: 745518225.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
5000                  427     2.0%
4000                  357     1.7%
6000                  289     1.3%
7200                  211     1.0%
4800                  145     0.7%
7500                  142     0.7%
8400                  116     0.5%
3600                  111     0.5%
4500                  111     0.5%
5100                  109     0.5%
Other values (8679)   19595   90.7%

Value                 Count   Frequency (%)
651                   1       < 0.1%
659                   1       < 0.1%
660                   1       < 0.1%
748                   2       < 0.1%
750                   4       < 0.1%
755                   1       < 0.1%
757                   1       < 0.1%
758                   1       < 0.1%
788                   1       < 0.1%
794                   1       < 0.1%

Value                 Count   Frequency (%)
871200                1       < 0.1%
858132                1       < 0.1%
560617                1       < 0.1%
438213                1       < 0.1%
434728                1       < 0.1%
425581                1       < 0.1%
422967                1       < 0.1%
411962                1       < 0.1%
392040                2       < 0.1%
386812                1       < 0.1%

house_age
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 320
Missing (%): 1.5%
Memory size: 169.0 KiB
old_house: 14616
new_house: 6677

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: old_house
2nd row: old_house
3rd row: old_house
4th row: old_house
5th row: old_house

Common Values

Value       Count   Frequency (%)
old_house   14616   67.6%
new_house   6677    30.9%
(Missing)   320     1.5%

Length

Histogram of lengths of the category

Pie chart

Value       Count   Frequency (%)
old_house   14616   68.6%
new_house   6677    31.4%
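
The Common Values table and the Pie chart table give different percentages for the same counts. The figures are reproduced if the first divides by all rows and the second by non-missing rows only, as this sketch shows:

```python
# Counts copied from the house_age tables.
counts = {"old_house": 14616, "new_house": 6677}
missing = 320

n_rows = sum(counts.values()) + missing      # all observations
n_present = n_rows - missing                 # rows with a value


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal, as displayed in the report."""
    return round(100 * part / whole, 1)


common_values = {k: pct(v, n_rows) for k, v in counts.items()}
pie_chart = {k: pct(v, n_present) for k, v in counts.items()}

print(n_rows, pct(missing, n_rows))
print(common_values)
print(pie_chart)
```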

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

dormitory_type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
house: 18641
apartment: 2760
studio: 212

Length

Max length: 9
Median length: 5
Mean length: 5.520612594
Min length: 5
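
The length statistics describe the category labels as strings, so they follow from the label counts. A small sketch that rebuilds them, using the counts listed for dormitory_type:

```python
from statistics import median

# Label counts copied from the dormitory_type summary.
counts = {"house": 18641, "apartment": 2760, "studio": 212}

total_rows = sum(counts.values())

# Mean label length, weighted by how often each label occurs.
mean_length = sum(len(label) * n for label, n in counts.items()) / total_rows

# Expand to one length per row to take the median, minimum and maximum.
lengths = [len(label) for label, n in counts.items() for _ in range(n)]
median_length = median(lengths)

print(round(mean_length, 9), median_length, min(lengths), max(lengths))
```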

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: house
2nd row: house
3rd row: apartment
4th row: house
5th row: house

Common Values

Value       Count   Frequency (%)
house       18641   86.2%
apartment   2760    12.8%
studio      212     1.0%

Length

Histogram of lengths of the category

Pie chart

Value       Count   Frequency (%)
house       18641   86.2%
apartment   2760    12.8%
studio      212     1.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

condition_type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
regular: 19710
good: 1701
bad: 202

Length

Max length: 7
Median length: 7
Mean length: 6.726507195
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: regular
2nd row: regular
3rd row: regular
4th row: good
5th row: regular

Common Values

Value     Count   Frequency (%)
regular   19710   91.2%
good      1701    7.9%
bad       202     0.9%

Length

Histogram of lengths of the category

Pie chart

Value     Count   Frequency (%)
regular   19710   91.2%
good      1701    7.9%
bad       202     0.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

price_tier
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB

tier 2  5460
tier 1  5404
tier 3  5376
tier 4  5373

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: tier 1
2nd row: tier 3
3rd row: tier 1
4th row: tier 3
5th row: tier 3

Common Values

Value   Count  Frequency (%)
tier 2   5460  25.3%
tier 1   5404  25.0%
tier 3   5376  24.9%
tier 4   5373  24.9%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
tier   21613  50.0%
2       5460  12.6%
1       5404  12.5%
3       5376  12.4%
4       5373  12.4%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Interactions

[Interaction plots (196 figures) not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
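
The calculation described above can be sketched in plain Python. This is an illustration rather than the code the report generator runs: the helper names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, tied values are assumed to receive the average of their positions, and the five (sqft_living, price) pairs are taken from the first rows of the sample shown in this report.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/(n-1) factors of covariance and standard deviations cancel.
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


sqft_living = [1180, 2570, 770, 1960, 1680]
price = [221900, 538000, 180000, 604000, 510000]
print(spearman_rho(sqft_living, price))
```

For these five rows only one pair of ranks is swapped between the two variables, so ρ comes out at 0.9.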

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
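
As a minimal sketch of that formula, and of the invariance under changes in location and scale, the snippet below computes r by hand. The function name `pearson_r` and the rescaling constants are illustrative assumptions; the data are the sqft_living and price values from the first five sample rows of this report.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


sqft_living = [1180, 2570, 770, 1960, 1680]
price = [221900, 538000, 180000, 604000, 510000]

r = pearson_r(sqft_living, price)

# Shifting and rescaling one variable (arbitrary constants) leaves r unchanged.
rescaled = [0.0929 * v + 10 for v in sqft_living]
r_rescaled = pearson_r(rescaled, price)

print(r, r_rescaled)
```

The two printed values agree up to floating-point rounding, which is what the invariance property predicts for a positive scale factor.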

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
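
The pair-counting definition above corresponds to the variant usually called τ-a, which can be sketched as follows. The function name `kendall_tau_a` is hypothetical, and the report's own figures may come from a tie-adjusted variant such as τ-b, so results can differ when the data contain ties. The data are again the first five sample rows.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables order the pair the same way
        elif direction < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # direction == 0 is a tie and counts toward neither group
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


sqft_living = [1180, 2570, 770, 1960, 1680]
price = [221900, 538000, 180000, 604000, 510000]
print(kendall_tau_a(sqft_living, price))
```

Of the ten possible pairs, nine are concordant and one is discordant, giving τ = (9 - 1) / 10 = 0.8.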

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
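
A minimal sketch of a bias-corrected Cramér's V in the form usually attributed to Bergsma (2013) is shown below. It is an illustration under stated assumptions, not the report generator's own code: the function name `cramers_v_corrected` is hypothetical, and the two toy label lists are made up to show the extreme cases of perfect association and independence.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long lists of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # association is undefined with a single category

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0
    return sqrt(phi2_corr / denom)


# Hypothetical toy data: the first pair is perfectly associated,
# the second pair is independent by construction.
perfect_x = ["house"] * 50 + ["apartment"] * 50
perfect_y = ["regular"] * 50 + ["good"] * 50
indep_x = ["a", "a", "b", "b"] * 25
indep_y = ["c", "d", "c", "d"] * 25

print(cramers_v_corrected(perfect_x, perfect_y))
print(cramers_v_corrected(indep_x, indep_y))
```

The `max(0.0, ...)` clamp is what keeps the corrected coefficient at exactly 0 for independent variables, where the uncorrected estimator tends to report a small positive value.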

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

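index | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | lat | long | sqft_living15 | sqft_lot15 | house_age | dormitory_type | condition_type | price_tier
0 | 2014-10-13 | 221900.00 | 3 | 1.00 | 1180 | 5650 | 1.00 | 0 | 0 | 3 | 7 | 1180 | 0 | 1955 | 47.51 | -122.26 | 1340 | 5650 | old_house | house | regular | tier 1
1 | 2014-12-09 | 538000.00 | 3 | 2.25 | 2570 | 7242 | 2.00 | 0 | 0 | 3 | 7 | 2170 | 400 | 1951 | 47.72 | -122.32 | 1690 | 7639 | old_house | house | regular | tier 3
2 | 2015-02-25 | 180000.00 | 2 | 1.00 | 770 | 10000 | 1.00 | 0 | 0 | 3 | 6 | 770 | 0 | 1933 | 47.74 | -122.23 | 2720 | 8062 | old_house | apartment | regular | tier 1
3 | 2014-12-09 | 604000.00 | 4 | 3.00 | 1960 | 5000 | 1.00 | 0 | 0 | 5 | 7 | 1050 | 910 | 1965 | 47.52 | -122.39 | 1360 | 5000 | old_house | house | good | tier 3
4 | 2015-02-18 | 510000.00 | 3 | 2.00 | 1680 | 8080 | 1.00 | 0 | 0 | 3 | 8 | 1680 | 0 | 1987 | 47.62 | -122.05 | 1800 | 7503 | old_house | house | regular | tier 3
5 | 2014-05-12 | 1225000.00 | 4 | 4.50 | 5420 | 101930 | 1.00 | 0 | 0 | 3 | 11 | 3890 | 1530 | 2001 | 47.66 | -122.00 | 4760 | 101930 | new_house | house | regular | tier 4
6 | 2014-06-27 | 257500.00 | 3 | 2.25 | 1715 | 6819 | 2.00 | 0 | 0 | 3 | 7 | 1715 | 0 | 1995 | 47.31 | -122.33 | 2238 | 6819 | new_house | house | regular | tier 1
7 | 2015-01-15 | 291850.00 | 3 | 1.50 | 1060 | 9711 | 1.00 | 0 | 0 | 3 | 7 | 1060 | 0 | 1963 | 47.41 | -122.31 | 1650 | 9711 | old_house | house | regular | tier 1
8 | 2015-04-15 | 229500.00 | 3 | 1.00 | 1780 | 7470 | 1.00 | 0 | 0 | 3 | 7 | 1050 | 730 | 1960 | 47.51 | -122.34 | 1780 | 8113 | old_house | house | regular | tier 1
9 | 2015-03-12 | 323000.00 | 3 | 2.50 | 1890 | 6560 | 2.00 | 0 | 0 | 3 | 7 | 1890 | 0 | 2003 | 47.37 | -122.03 | 2390 | 7570 | new_house | house | regular | tier 2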

Last rows

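index | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | lat | long | sqft_living15 | sqft_lot15 | house_age | dormitory_type | condition_type | price_tier
21603 | 2014-08-25 | 507250.00 | 3 | 2.50 | 2270 | 5536 | 2.00 | 0 | 0 | 3 | 8 | 2270 | 0 | 2003 | 47.54 | -121.88 | 2270 | 5731 | new_house | house | regular | tier 3
21604 | 2015-01-26 | 429000.00 | 3 | 2.00 | 1490 | 1126 | 3.00 | 0 | 0 | 3 | 8 | 1490 | 0 | 2014 | 47.57 | -122.29 | 1400 | 1230 | new_house | house | regular | tier 2
21605 | 2014-10-14 | 610685.00 | 4 | 2.50 | 2520 | 6023 | 2.00 | 0 | 0 | 3 | 9 | 2520 | 0 | 2014 | 47.51 | -122.17 | 2520 | 6023 | new_house | house | regular | tier 3
21606 | 2015-03-26 | 1007500.00 | 4 | 3.50 | 3510 | 7200 | 2.00 | 0 | 0 | 3 | 9 | 2600 | 910 | 2009 | 47.55 | -122.40 | 2050 | 6200 | new_house | house | regular | tier 4
21607 | 2015-02-19 | 475000.00 | 3 | 2.50 | 1310 | 1294 | 2.00 | 0 | 0 | 3 | 8 | 1180 | 130 | 2008 | 47.58 | -122.41 | 1330 | 1265 | new_house | house | regular | tier 3
21608 | 2014-05-21 | 360000.00 | 3 | 2.50 | 1530 | 1131 | 3.00 | 0 | 0 | 3 | 8 | 1530 | 0 | 2009 | 47.70 | -122.35 | 1530 | 1509 | new_house | house | regular | tier 2
21609 | 2015-02-23 | 400000.00 | 4 | 2.50 | 2310 | 5813 | 2.00 | 0 | 0 | 3 | 8 | 2310 | 0 | 2014 | 47.51 | -122.36 | 1830 | 7200 | new_house | house | regular | tier 2
21610 | 2014-06-23 | 402101.00 | 2 | 0.75 | 1020 | 1350 | 2.00 | 0 | 0 | 3 | 7 | 1020 | 0 | 2009 | 47.59 | -122.30 | 1020 | 2007 | new_house | apartment | regular | tier 2
21611 | 2015-01-16 | 400000.00 | 3 | 2.50 | 1600 | 2388 | 2.00 | 0 | 0 | 3 | 8 | 1600 | 0 | 2004 | 47.53 | -122.07 | 1410 | 1287 | new_house | house | regular | tier 2
21612 | 2014-10-15 | 325000.00 | 2 | 0.75 | 1020 | 1076 | 2.00 | 0 | 0 | 3 | 7 | 1020 | 0 | 2008 | 47.59 | -122.30 | 1020 | 1357 | new_house | apartment | regular | tier 2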